Outliers, Extreme Events and Multiscaling 
Extreme events have an important role, which is sometimes catastrophic, in a variety of natural
phenomena including climate, earthquakes and turbulence, as well as in man-made environments
like financial markets. Statistical analysis and predictions in such systems are complicated by the
fact that on the one hand extreme events may appear as "outliers" whose statistical properties do
not seem to conform with the bulk of the data, and on the other hand they dominate the (fat)
tails of probability distributions and the scaling of high moments, leading to "abnormal" or "multi"-scaling.
We employ a shell model of turbulence to show that it is very useful to examine in detail the
dynamics of onset and demise of extreme events. Doing so may reveal dynamical scaling properties
of the extreme events that are characteristic of them, and not shared by the bulk of the fluctuations.
As the extreme events dominate the tails of the distribution functions, knowledge of their dynamical 
scaling properties can be turned into a prediction of the functional form of the tails. We show that 
from the analysis of relatively short time horizons (in which the extreme events appear as outliers) 
we can predict the tails of the probability distribution functions, in agreement with data collected 
in very much longer time horizons. The conclusion is that events that may appear unpredictable 
on relatively short time horizons are actually a consistent part of a multiscaling statistics on longer 
time horizons. 



I. INTRODUCTION 



There is an obvious and widespread interest in predicting
extreme events in a variety of contexts. Particularly
well known examples are the insurance risks related to 
large tropical storms, human and property risks in the 
context of large earthquakes, financial risks caused by 
large movements of the markets, and dangers to passen- 
ger planes due to extremely intermittent turbulent air ve- 
locities. Obviously, any improvement in the predictabil- 
ity of any of these extreme events is highly desirable for 
a number of reasons. Accordingly, there exists a large 
body of work focusing on the statistics of such events, 
small, intermediate and large, with the aim of studying 
the ensuing probability distribution functions (PDF). If 
one can model properly the PDF, one can in principle 
predict at least the frequency of extreme events. Yet, 
there is one fundamental question that needs
to be confronted first: do the extreme events share the
same statistical properties as the small and intermediate
events, or are they "outliers"? If the latter is true, then
no analysis of the core of the PDF, clever as it may be, 
could yield a proper answer to the desire to predict the 
probability of extreme events. 

Indeed, in a number of contexts it has been proposed
recently that extreme events are "outliers" [?]. For example,
in financial markets the largest draw-downs appear to
exhibit properties that differ from the bulk of the fluctuations
[?]. In general one would refer to "outliers" when
the rate of occurrence of small and intermediate events 
lies on a PDF with some given properties, while the ex- 
treme events appear to exhibit statistical properties that 
differ from the bulk in a significant way. The aim of this 
paper is to present a detailed analysis of the fluctuations 
in a turbulent dynamical system that shows that such a 
point of view can be substantiated. Clearly, this type of
consideration must be conducted with great care. The
danger is that on small time horizons the largest events 
appear so rarely, once or twice, that their rate of occur- 
rence is not statistically significant, and no conclusion 
about their relation to the statistics of small and interme- 
diate events is possible. Nevertheless, we offer in this pa- 
per a positive outlook. We will show that in the context 
of the bulk of this paper, which is the analysis of a shell 
model of turbulence, one can analyze within the short 
time horizon the dynamics of the extreme events. This 
analysis reveals their special dynamical scaling proper- 
ties, allowing us to make interesting predictions about 
the tails of the distribution functions even before the full 
statistics is available. These predictions can be checked 
in our case by considering much longer time horizons. 
The conclusion for the extreme events community is that 
it may very well pay to look very carefully at the de- 
tailed dynamics of the extreme events if one wants to 
claim anything about their probability of occurrence. 

The model that we treat in detail in this paper is a so-called
"shell" model of turbulence. Shell models of turbulence [?]
are simplified caricatures of the equations of
fluid mechanics in wave-vector representation; typically
they exhibit anomalous scaling even though their nonlinear
interactions are local in wavenumber space. The
wavenumbers are represented as shells, which are chosen
as a geometric progression

$$k_n = k_0 \lambda^n , \tag{1}$$

where $\lambda$ is the "shell spacing". There are $N$ degrees of
freedom where $N$ is the number of shells. The model
specifies the dynamics of the "velocity" $u_n$, which is considered
a complex number, $n = 1, \ldots, N$. The main advantage
of shell models is that they can be studied via fast and accurate
numerical simulations, in which the values of the scaling
exponents can be determined very precisely. We employ
our own home-made shell model which had been christened
the Sabra model [?]. It exhibits similar anomalies
of the scaling exponents to those found in the previously
popular GOY model [?], but with much simpler correlation
properties, and much better scaling behavior in
the inertial range. The equations of motion for the Sabra
model read:

$$\frac{du_n}{dt} = i\left(a k_{n+1} u_{n+2} u_{n+1}^* + b k_n u_{n+1} u_{n-1}^* - c k_{n-1} u_{n-1} u_{n-2}\right) - \nu k_n^2 u_n + f_n , \tag{2}$$

where the star stands for complex conjugation, $f_n$ is a
forcing term which is restricted to the first shells and $\nu$
is the "viscosity". In this paper we restrict the forcing
to the first and second shells only ($n = 1, 2$). The
coefficients $a$, $b$ and $c$ are chosen such that

$$a + b + c = 0 . \tag{3}$$

This sum rule guarantees the conservation of the "energy"

$$E = \sum_n |u_n|^2 , \tag{4}$$

in the inviscid ($\nu = 0$) limit.
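The role of the sum rule can be checked numerically. The following is a minimal sketch, not the code used for the simulations reported here: the function names `sabra_rhs` and `energy_rate` are hypothetical, the boundary condition (zero velocity outside shells 1..N) is an assumption, and the test state is an arbitrary random vector rather than a turbulent realization.

```python
import numpy as np

def sabra_rhs(u, k, a=1.0, b=-0.5, c=-0.5, nu=0.0, f=None):
    """Right-hand side of the Sabra equations for shells n = 1..N (array index 0..N-1)."""
    n_shells = u.size
    up = np.zeros(n_shells + 4, dtype=complex)   # ghost shells: u = 0 outside 1..N (assumption)
    up[2:-2] = u
    kp = np.zeros(n_shells + 4)                  # ghost wavenumbers only ever multiply zeros
    kp[2:-2] = k
    u_m2, u_m1, u_p1, u_p2 = up[:-4], up[1:-3], up[3:-1], up[4:]
    k_m1, k_p1 = kp[1:-3], kp[3:-1]
    nonlinear = 1j * (a * k_p1 * u_p2 * np.conj(u_p1)
                      + b * k * u_p1 * np.conj(u_m1)
                      - c * k_m1 * u_m1 * u_m2)
    forcing = 0.0 if f is None else f
    return nonlinear - nu * k ** 2 * u + forcing

def energy_rate(u, k, **kwargs):
    """Time derivative of E = sum_n |u_n|^2 implied by sabra_rhs."""
    return 2.0 * np.sum((np.conj(u) * sabra_rhs(u, k, **kwargs)).real)

lam, k0, n_shells = 2.0, 1.0 / 64.0, 28           # values quoted in the caption of Fig. 1
k = k0 * lam ** np.arange(1, n_shells + 1)         # geometric progression of shells
rng = np.random.default_rng(0)                     # arbitrary test state
u = (rng.normal(size=n_shells) + 1j * rng.normal(size=n_shells)) * k ** (-1.0 / 3.0)
scale = np.sum(k * np.abs(u) ** 3)                 # typical size of one nonlinear term

print(energy_rate(u, k) / scale)                   # vanishes to rounding when a + b + c = 0
print(energy_rate(u, k, c=-0.4) / scale)           # generally nonzero once the sum rule is broken
```

With the viscosity and forcing switched off, the first printed number should be at the level of floating-point rounding, while changing c so that a + b + c differs from zero destroys the cancellation.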

The main attraction of this model is that it displays
multiscaling in the sense that moments of the velocity
depend on $k_n$ as power laws with nontrivial exponents:

$$S_p(k_n) \equiv \langle |u_n|^p \rangle \propto k_n^{-\zeta_p} \propto \lambda^{-n\zeta_p} , \tag{5}$$

where the scaling exponents $\zeta_p$ exhibit a nonlinear dependence
on $p$. We expect such scaling laws to appear in the
"inertial range" with shell index $n$ larger than the largest
shell index that is affected by the forcing, denoted as $n_L$,
and smaller than the shell indices affected by the viscosity,
the smallest of which will be denoted as $n_d$. The
scaling exponents were determined with high numerical
accuracy, better than 0.02, in Ref. [?].
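The definition of the scaling exponents translates directly into a fitting procedure. The sketch below is illustrative only: `scaling_exponents` is a hypothetical helper, the data are synthetic non-intermittent amplitudes (for which the exponents are p/3 by construction), and the choice of inertial-range shells is an assumption.

```python
import numpy as np

def scaling_exponents(u, lam, orders, n_lo, n_hi):
    """Estimate zeta_p from S_p(k_n) = <|u_n|^p> ~ lam**(-n * zeta_p).

    u has shape (n_times, n_shells); column j holds shell n = j + 1.
    Only shells n_lo..n_hi (inclusive, 1-based) enter the fit.
    """
    n = np.arange(n_lo, n_hi + 1)
    amp = np.abs(u[:, n_lo - 1:n_hi])
    zetas = []
    for p in orders:
        s_p = np.mean(amp ** p, axis=0)                        # time average per shell
        slope = np.polyfit(n, np.log(s_p) / np.log(lam), 1)[0]
        zetas.append(-slope)
    return np.array(zetas)

# Synthetic check: |u_n| = g(t) * lam**(-n/3) has no intermittency, so zeta_p = p/3.
rng = np.random.default_rng(1)
lam, n_shells = 2.0, 20
g = np.abs(rng.normal(size=(5000, 1)))
u = g * lam ** (-np.arange(1, n_shells + 1) / 3.0)
orders = [1, 2, 3, 4]
zeta = scaling_exponents(u, lam, orders, n_lo=4, n_hi=16)
print(zeta)
```

Applied to an actual shell-model time series, the same fit would be restricted to shells between the forced and the viscous ranges.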

To introduce the issue behind the title of this paper,
we present in Fig. 1 a typical time series for $u_{11}$. The parameters
of the model are detailed in the figure legend.
One can see the typical appearance of rare events with
amplitude that exceeds the mean by a factor of 6-8. To
pose the question in its clearest way we display in Fig. 2
a distribution function which is the normalized rate of
occurrence (i.e. the number of times) that a given ampli- 
tude has been observed in the time window of 10^ time 
steps. This apparent relative frequency of events is very
similar to findings in real data, see for example Fig. 1 of
Ref. [?], which deals with draw-downs in the Dow Jones
Average. Similarly to the analysis there, we can pass an
approximate straight line through the points representing
small and intermediate events. Such an exponential
law would mean that the events of $|u_{11}|^2$ with amplitudes
larger than, say, $4\langle |u_{11}|^2\rangle$ are clear outliers. Their probability
is so low that they should not have appeared in
the short time horizon at all. We could conclude, as in
the analysis of Ref. [?], that the extreme events cannot
be described by the same distribution function as the small
and intermediate events.
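The reasoning of this paragraph (fit an exponential law to the core, then ask how many events it allows beyond a threshold) can be sketched as follows. This is an assumption-laden illustration: `expected_beyond` is a hypothetical helper, and the samples are drawn from a pure exponential law, for which the extrapolation should agree with the observed count. A surplus of observed events over the extrapolated number is what would be read as "outliers".

```python
import numpy as np

def expected_beyond(samples, core_max, threshold, bins=40):
    """Fit ln(count) = c0 + c1*x on the core (x < core_max) and extrapolate
    the expected number of samples with x > threshold."""
    counts, edges = np.histogram(samples, bins=bins, range=(0.0, core_max))
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = counts > 0                                   # avoid log(0) in empty bins
    slope, intercept = np.polyfit(centers[keep], np.log(counts[keep]), 1)
    width = edges[1] - edges[0]
    # Integral of exp(intercept + slope*x) / width from threshold to infinity (slope < 0).
    return np.exp(intercept + slope * threshold) / (-slope * width)

rng = np.random.default_rng(3)
samples = rng.exponential(scale=1.0, size=200_000)      # synthetic amplitudes
predicted = expected_beyond(samples, core_max=3.0, threshold=4.0)
observed = np.count_nonzero(samples > 4.0)
print(predicted, observed)
```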




FIG. 1. Time series for the normalized velocity of the 11th
shell (horizontal axis: time steps, from 0 to 1.0e+07). Parameters of the numerics: $a = 1$, $b = c = -0.5$,
$\lambda = 2$, $N = 28$, $k_0 = 1/64$, time-correlated random forcing on
the first two shells with characteristic amplitude $0.005(1+i)$.
The decorrelation time is chosen to be about the turnover time of the 1st
shell.
FIG. 2. Apparent probability distribution function for the 11th
shell. Averaging over 10^ time-steps which is about 250 decor- 
relation times for this shell. The data contains additional, 
extremely sparse events of amplitude larger than 7, occurring 
once each; these were left out in this plot. 



On the other hand, it is very possible that the low rate
of occurrence of the extreme events in Fig. 2 means simply
that they are statistically irrelevant and that no conclusion
can be drawn. How can this difficulty be overcome?
The purpose of this paper is to show that indeed the extreme
events may have dynamical scaling properties that
are all their own, and that they crucially affect the tails
of the distribution functions, making them very broad
indeed. The main new point is that detailed analysis of
the extreme events in the short time horizon suffices to
make many predictions about the tails of the PDFs,
predictions that in our case can be easily confirmed by
considering much longer time horizons.

In explaining our ideas we will try to distinguish as- 
pects which are general, and that in our view may have 
applications to other systems with extreme events, and 
aspects which are particular to the example of the shell 
model of turbulence. Thus we start in Sect. 2 with an 
analysis of the temporal shape of the extreme events. We 
believe that this analysis is very general, leading to an im- 
portant relation between the amplitude of the event and 
its time scale (the time elapsing from rise up to demise) . 
In Sect. 3 we employ the dynamical scaling form of the 
extreme events to present a theory of the tails of the 
distribution functions. We can relate the tails of PDF's 
belonging to different scales. In Sect. 4 we discuss nu- 
merical studies of the PDF's, distinguishing the core and 
the tails. In Sect. 5 the main numerical findings are ra- 
tionalized theoretically on the basis of universal "pulse" 
solutions of the dynamics of the Sabra model. Section 
6 contains the bottom line: we make use of the scaling 
relations to predict the tails of PDF's from data collected 
within short time horizons. Direct measurements of these 
tails give nonsense unless the time horizons are increased 
a hundred fold. Yet with the help of the theoretical forms 
we can offer predicted tails that agree very well with the 
data collected with much longer time horizons. 



II. DETAILED DYNAMICS AND SCALING OF 
THE EXTREME EVENTS 

In turbulence in general and in our shell model in par- 
ticular the energy that is injected by the forcing at the 
largest scales (n = 1 and 2) is transferred on the average 
to smaller scales. It is advantageous to analyze the ex- 
treme events of a given scale (or given shell n) and also to 
follow the cascade of extreme events from scale to scale. 
We first consider a given shell. 



A. Temporal dependence of extreme events of a 
given scale 

We focus here on the detailed dynamics of the largest 
events of a given scale. We considered for example the 
time series of the 20th shell (n = 20) and isolated the



5 largest events (in terms of their amplitude) as they 
occurred in a time window of 10^ time steps. In the 
first step of analysis we normalized these 5 events by the 
amplitude at their maximum. Next we plotted these nor- 
malized events as a function of time, subtracting the time 
at which they have reached their maximum value. The 
result of this replotting is shown in Fig. 3. Obviously a
similar replotting can be done for any time series, and by
itself carries no information.







FIG. 3. Collapse (of positions and amplitudes) for five intense
peaks belonging to the 20th shell. The values of $u_{\max}$ for
the peaks numbered from 1 to 5 are 4.65, 4.77, 6.71, 7.40 and
10.5 respectively, in units of the rms velocity in this shell. The
narrowest peak is thus the tallest.
FIG. 4. Full collapse (of the position, amplitude and width)
of the same peaks as in Fig. 3. The ordering of the points is
1 to 5 from left to right.



The next step of the analysis already reveals something
interesting. Building on the normalized events of Fig. 3
we attempt to rescale the time axis for each event in order
to collapse the data together. Of course, each event calls
for a different rescaling factor, which we denote (in frequency
units) as $f_r$. The fact that such rescaling factors
exist, and that they lead to data collapse as shown in
Fig. 4, is nontrivial; such a collapse may or may not exist in
different cases. But we will show that if such a rescaling
is found, it can serve as a starting point for very useful
considerations.



The third step of the present analysis is a search for the
meaning of the rescaling factors $f_r$. We hope that $f_r$ has
a simple relation to the amplitude of the extreme events.
To test this we can plot the individual values of $f_r$ found
in Fig. 4 as a function of the amplitude at the peak. The
resulting plot is shown as Fig. 5. In passing the straight
line through the data points we included the point (0, 0)
in the analysis, as we search for a simple scaling form

$$f_r \propto u_{\max}^{x} , \tag{6}$$

with $x$ a scaling exponent. We conclude that in this case
we have a satisfactory scaling law with $x = 1$.

FIG. 5. Width normalization vs amplitude for the 5 peaks collapsed in Fig. 4.
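The three steps described above (normalize by the peak amplitude, find a time-rescaling factor for each event, and relate that factor to the amplitude) can be sketched on synthetic data. Everything below is an assumption made for illustration: the pulse shape A/cosh^2(At) is chosen only because its width scales as 1/A, the helper `inverse_width` uses the full width at half maximum as a stand-in for the rescaling factor, and only the five peak heights are taken from the caption of Fig. 3.

```python
import numpy as np

def inverse_width(t, signal):
    """Return (peak amplitude, 1/FWHM) of a single-peaked signal sampled on t."""
    amp = signal.max()
    above = t[signal >= 0.5 * amp]          # samples inside the half-maximum window
    return amp, 1.0 / (above[-1] - above[0])

# Synthetic pulses of height A whose width is proportional to 1/A.
t = np.linspace(-5.0, 5.0, 200001)
amps = np.array([4.65, 4.77, 6.71, 7.40, 10.5])      # peak heights quoted for Fig. 3
pairs = np.array([inverse_width(t, a / np.cosh(a * t) ** 2) for a in amps])
u_max, f_r = pairs[:, 0], pairs[:, 1]

# Scaling exponent x from a log-log fit of f_r against u_max.
x = np.polyfit(np.log(u_max), np.log(f_r), 1)[0]
# Straight line through the origin, f_r = s * u_max (least squares with the point (0, 0) imposed).
s = np.sum(u_max * f_r) / np.sum(u_max ** 2)
print(x, s)
```

For these synthetic pulses the fit returns an exponent close to 1, which is the linear law read off from Fig. 5.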

The meaning of this scaling law is quite apparent in the
present case. Looking back at the equation of motion we
realize that from the point of view of power counting (not
to be confused with actual dynamics) it can be written
as

$$\frac{du}{dt} \propto u^{1+x} , \tag{7}$$

with $x = 1$. It is thus acceptable that a rescaling of $1/t$
by $u_{\max}$ should collapse all the extreme events as shown
above. If the equation of motion were cubic in $u$ we could
expect $x = 2$, etc. Obviously, the rescaling analysis in this
case revealed the type of dynamics underlying the process.
Whether this can be done effectively in other cases
where extreme events are crucial is an open question for
future research.
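The power counting can be spelled out in one line. The dimensionless variables $\tilde u$ and $\tilde t$ below are introduced here for illustration only; the argument is a sketch of dimensional reasoning, not a statement about the actual dynamics.

```latex
% Rescale the amplitude by its peak value and the time by the inverse rescaling factor:
u = u_{\max}\,\tilde u , \qquad t = \tilde t / f_r
\quad\Longrightarrow\quad
\frac{du}{dt} = u_{\max}\, f_r\, \frac{d\tilde u}{d\tilde t} .
% Inserting this into du/dt \propto u^{1+x} and demanding an equation free of u_max:
u_{\max}\, f_r\, \frac{d\tilde u}{d\tilde t} \propto u_{\max}^{1+x}\, \tilde u^{1+x}
\quad\Longrightarrow\quad
f_r \propto u_{\max}^{x} .
```

A quadratic nonlinearity thus corresponds to a rescaling factor linear in the peak amplitude, and a cubic one to a quadratic dependence.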



FIG. 6. "Evolution" of a peak from the 15th to the 20th
shell (horizontal axis: time). The amplitudes are all in the same (arbitrary) units.
One sees a progressive shift of the maximum to the right and
a decrease in the amplitude, accompanied by narrowing and
splitting. Nevertheless the form of the central part of the peak
remains self-similar, as exemplified in the two following figures.



B. Transfer of extreme events between different 
scales 

To gain further understanding of the extreme events
we focus now on the transfer from scale to scale. Consider
for example a particular large amplitude event in
the shell n = 15, and its fate as time proceeds.
This is shown in Fig. 6. The event reached its highest
amplitude at shell 15 around $t = 2.625$. At a slightly later
time it appeared as a large event in shell 16, and with a
shorter delay at shell 17 where it started to split into a
doublet. At even shorter delays this event emerges as a
triplet and then a multiplet at shells 18, 19 and 20.

A very important characteristic of the dynamics of
large events can be obtained by relating
the maximal amplitudes of the first peak in the different
shells. As was done above, we first replot all the first
peaks as a function of time minus the time $t_n$ of their
maximal amplitude $u_{n,\max}$. We then glue all the maxima
together by rescaling the peak amplitudes relative
to the peak of a chosen shell. Denote by $K_{\rm am}(n, m)$ the
amplitude of the peak in the $n$th shell relative to that in the $m$th
shell. Choosing in our example $m = 20$ we then seek a
single exponent $y$ such that

$$K_{\rm am}(n,20) \equiv \frac{u_{n,\max}}{u_{20,\max}} = \lambda^{(20-n)y} , \tag{8}$$

where $\lambda$ is the shell spacing defined by Eq. (1). The value
of $y$ is obtained by plotting $g_{\rm am}(n)$ vs $(20 - n)$, where

$$g_{\rm am}(n) \equiv \ln[K_{\rm am}(n,20)]/\ln\lambda = y(20-n) . \tag{9}$$

The best fit is obtained with $y = 0.24 \pm 0.01$, see Fig. 9.
The peaks, which are now glued at their maxima as shown
in Fig. 7, still have very different time-widths.

Next, as before, we want to collapse all these curves
by rescaling the time axis according to $(t - t_n) \to (t - t_n)/K_w(n,20)$.
Expecting the scaling law $K_w(n,20) = \lambda^{z(20-n)}$, it is natural to consider

$$g_w(n) \equiv \ln[K_w(n,20)]/\ln\lambda = z(20-n) . \tag{10}$$

The exponent $z = 0.75 \pm 0.02$ is found by computing "the
best" linear fit of $g_w(n)$ vs $(20 - n)$, see Fig. 9. The quality
of the resulting data collapse can be seen in Fig. 8.
Note that within the error bars $z + y \approx 1$. This sum rule
will be rationalized theoretically in Sec. V.
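The extraction of the two exponents from linear fits can be sketched as follows. The peak heights and widths below are hypothetical: they are generated from the assumed scaling laws with the quoted values y = 0.24 and z = 0.75 plus a small random scatter, so the sketch only illustrates the fitting step, not the measurement itself. The helper name `exponent` is likewise an assumption.

```python
import numpy as np

lam, m = 2.0, 20                         # shell spacing and reference shell
shells = np.arange(15, 21)               # shells 15..20

# Hypothetical measurements obeying the scaling laws, with 1% log-normal scatter.
rng = np.random.default_rng(2)
y_true, z_true = 0.24, 0.75
u_max = lam ** (-shells * y_true) * np.exp(0.01 * rng.normal(size=shells.size))
width = lam ** (-shells * z_true) * np.exp(0.01 * rng.normal(size=shells.size))

def exponent(ratio_to_ref, shells, m, lam):
    """Slope of ln(ratio)/ln(lam) against (m - n)."""
    g = np.log(ratio_to_ref) / np.log(lam)
    return np.polyfit(m - shells, g, 1)[0]

y_fit = exponent(u_max / u_max[-1], shells, m, lam)   # amplitudes relative to shell 20
z_fit = exponent(width / width[-1], shells, m, lam)   # time-widths relative to shell 20
print(y_fit, z_fit, y_fit + z_fit)
```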

The bottom line of this analysis can be summarized in
a dynamical scaling form for the extreme events:

$$u_n(t) \approx v \lambda^{-ny} f\left[(t - t_n)\, v k_0 \lambda^{nz}\right] . \tag{11}$$

Here $v$ is a characteristic velocity amplitude associated
with the cascade of a particular large event which starts
at small $n$ and eventually reaches large values of $n$. As
such $v$ is not universal. We stress that the scaling form
was derived on the basis of a time series in the short
time horizon, i.e. the same one that gave rise to the
apparent PDF shown in Fig. 2. We will see that these
findings suffice to make rather strong predictions about
the expected form of the converged PDF. A theoretical
understanding of the origin of the scaling form (11) will
be presented in Sec. V.
FIG. 7. Collapse of the peak amplitudes for shells 15-20.
The initial peaks are shown in Fig. 6.

FIG. 8. Full self-similar collapse of the peaks for shells 15-20.

FIG. 9. Fits of the rescaling factors $g_w(n)$ and $g_{\rm am}(n)$ for
the peaks in the shells 15-20 shown in Figs. 7 and 8.
Note that in comparing different shells the rescaling of the
frequency increases when the peak decreases in amplitude.
This is opposite to the rescaling of peaks within a given shell.



III. IMPLICATIONS FOR THE TAILS OF THE
PROBABILITY DISTRIBUTION FUNCTIONS

A. Asymptotic Scaling Exponents

Having a scaling form for the large events means a
great deal for the structure functions $S_p(k_n)$ [cf. Eq. (5)]
for high values of $p$. In fact for high $p$ the structure
functions are dominated by the large events. To demonstrate
this we show in Fig. 10 the relative contribution to
$S_p(k_{20})$ that arises from velocity amplitudes that exceed a
threshold $v_*$. In this plot $S_{p,v_*}$ is the structure function
of Eq. (5) where only events with $|u_{20}| > v_*$ are considered,
whereas $S_p$ contains all the data. Obviously the higher
$p$ is, the higher is the contribution of large events. For any
time window there exists a largest event, and when $v_*$
exceeds its value, $S_{p,v_*}$ necessarily vanishes.

FIG. 10. Normalized contributions to the structure functions
of orders p = 1, 2 ... 15 for the 20th shell from the part of
the velocity realization with amplitude exceeding the threshold $v_*$.

If we accept the scaling form (11) we can use it to
predict the scaling exponent $\zeta_p$ for high values of $p$. By
definition

$$S_p(k_n) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} |u_n|^p\, dt \propto k_n^{-\zeta_p} \propto \lambda^{-n\zeta_p} . \tag{12}$$

For $p$ large enough the structure functions are dominated
by the well separated events. Instead of the integral over
the interval $[-T, T]$ we can sum up the integrals over the
separated peaks. Substituting for each peak the form
(11) and noting that the number of peaks is proportional
to $T$, we can extend the integration interval to $[-\infty, \infty]$
and write

$$S_p(k_n) \propto \lambda^{-nyp} \int_{-\infty}^{\infty} f^p(\lambda^{nz} t v k_0)\, dt \propto \lambda^{-n(yp+z)} \int_{-\infty}^{\infty} f^p(\tau)\, d\tau . \tag{13}$$

Comparing the exponents of $\lambda$ here and in the previous
equation we find the scaling exponents

$$\zeta_p = yp + z . \tag{14}$$

Of course this prediction is valid only for high values of
$p$ for which the contributions of the isolated peaks are
dominant.
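The step from the scaling form of a single pulse to the asymptotic exponent can be verified numerically. In this sketch the pulse shape f is taken to be a Gaussian and the units are chosen so that the characteristic velocity and the smallest wavenumber equal 1; both are assumptions made purely for illustration, since the prediction does not depend on the shape of f. The function names are hypothetical.

```python
import numpy as np

y, z, lam = 0.24, 0.75, 2.0          # measured exponents and shell spacing
v, k0 = 1.0, 1.0                     # arbitrary units (assumption)
t = np.linspace(-10.0, 10.0, 400_001)
shells = np.arange(0, 8)

def pulse_moment(n, p):
    """Integral of |u_n(t)|^p over one pulse of the self-similar form, centred at t_n = 0."""
    x = t * v * k0 * lam ** (n * z)
    u = v * lam ** (-n * y) * np.exp(-x ** 2)      # Gaussian f(x): illustrative assumption
    return np.sum(u ** p) * (t[1] - t[0])

def zeta_from_pulses(p):
    """Exponent of lam**(-n) in the single-pulse contribution to S_p."""
    log_s = np.log([pulse_moment(n, p) for n in shells]) / np.log(lam)
    return -np.polyfit(shells, log_s, 1)[0]

for p in (4, 8, 12):
    print(p, zeta_from_pulses(p), y * p + z)
```

The fitted exponent coincides with yp + z because the amplitude factor contributes yp and the shrinking duration of the pulse contributes z.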



B. Tails of the Probability Distribution Function

We turn now to the prediction of the tails of the PDFs,
assuming that these tails are dominated by well separated
peaks with the self-similar evolution (11). We will see below
[cf. Eq. (19)] that the tails of the predicted PDF are
very sensitive to the exponents in Eq. (11), but rather
insensitive to the precise form of the universal function $f(x)$
in Eq. (11). Assume then for simplicity that $f(x) = 1$ for
$|x| < 1/2$ and $f(x) = 0$ for $|x| > 1/2$. There is the free
parameter $v$ in Eq. (11); for the chaotic realizations $u_n(t)$
we consider it as a random parameter. Define then the
variable $V^2$ according to

$$V^2 \equiv v^2/v_0^2 , \qquad v_0^2 \equiv \sum_{n=1}^{N} \langle |u_n|^2 \rangle . \tag{15}$$

Consider now a run with a total time horizon $T = 1/(k_0 v_0)$.
Denote as $W(V^2)\,dV^2$ the number of peaks
measured in this run in which the value of $V^2$ fell in the
window $[V^2, V^2 + dV^2]$.

Next denote the normalized amplitudes [the value of the
signal at times $t = t_n$ in Eq. (11)] as

$$U_n^2 \equiv \frac{|u_n(t_n)|^2}{\langle |u_n|^2\rangle} = \frac{V^2}{C}\,\lambda^{-n\alpha} , \qquad \alpha = 2y - \zeta_2 , \tag{16}$$

where $C$ is a dimensionless constant. We are interested
in the PDF $P_n(U_n^2)$, where $P_n(U_n^2)\,dU_n^2$ is the probability
to sample a normalized amplitude in the $n$th shell between
$U_n^2$ and $U_n^2 + dU_n^2$. By definition, the number of
observations of such amplitudes in the time horizon $T$ is $dN_n$,

$$dN_n = P_n(U_n^2)\, dU_n^2\, \frac{T}{\tau_0} , \tag{17}$$

where $\tau_0$ is the length of the sampling intervals. On the
other hand, since the lifetime of a peak with a given value
of $V^2$ belonging to the $n$th shell is $1/(v k_0 \lambda^{nz})$, we can also
estimate the number of observations $dN_n$ as

$$dN_n = \frac{W(V^2)}{\tau_0\, v k_0 \lambda^{nz}}\, dV^2 . \tag{18}$$

Equating Eqs. (17) and (18) and rearranging, one gets:

$$P_n(U_n^2) = C\, W(V^2)\, \lambda^{n(\alpha - z)}/V . \tag{19}$$

This relation is obtained under the assumption that the
number of peaks is not increasing in the cascade process.
In fact we saw that the number of peaks increases
with the shell number $n$, presumably in a scale-invariant
manner, as $\lambda^{n\beta}$ with some positive exponent $\beta$. We can account
for this effect by replacing $W$ in Eq. (19) by $\lambda^{n\beta} W$.
After that:

$$P_n(U_n^2) = C\, W(V^2)\, \lambda^{n(\alpha + \beta - z)}/V , \tag{20}$$

where $V^2$ and $U_n^2$ are defined by Eqs. (15) and (16).
Equation (20) means that a collapse of the tails of the
PDFs for different shells may be achieved by rescaling
the x-axis $U_n^2 \to V^2$ according to (16) and rescaling
the PDFs (y-axis) by $\lambda^{n(\alpha+\beta-z)}$.

Equation (20) for the tail of the PDFs allows one to
find the high order structure functions (which are dominated
by the tails of the PDFs) and their scaling exponents $\zeta_p$:

$$S_p(k_n) = \int |u_n|^p\, P_n(U_n^2)\, dU_n^2 = C_p\, v_0^p\, \lambda^{-n(z - \beta + yp)} , \tag{21}$$

$$C_p \equiv \int V^{p-1}\, W(V^2)\, dV^2 . \tag{22}$$

Comparing again the exponents of $\lambda$ here and in Eq. (12)
gives the prediction for the high order scaling exponents:

$$\zeta_p = yp + z - \beta , \tag{23}$$

which coincides with Eq. (14) at $\beta = 0$. One sees that
the effect of peak splitting (which is described by the positive
exponent $\beta$) increases the deviation of the scaling
exponents from their K41 values $\zeta_p = p/3$.
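The collapse procedure implied by the predicted tail can be sketched as follows. The sketch rests on several assumptions that are not part of the analysis above: the dimensionless constant C is set to 1, the peak-count density `W` is an arbitrary illustrative choice, beta is given a trial value of 0.2, and the normalized amplitude is taken to scale as $U_n^2 = V^2 \lambda^{-n\alpha}/C$ with $\alpha = 2y - \zeta_2$. The function names are hypothetical.

```python
import numpy as np

lam, y, z, zeta2, beta = 2.0, 0.24, 0.75, 0.72, 0.2   # beta is a trial value (assumption)
alpha = 2 * y - zeta2

def W(V2):
    """Assumed density of peaks per unit V^2; illustrative only."""
    return np.exp(-3.0 * np.sqrt(V2))

def pdf_tail(U2, n, C=1.0):
    """Predicted tail P_n(U^2) = C W(V^2) lam**(n(alpha+beta-z)) / V."""
    V2 = C * U2 * lam ** (n * alpha)
    return C * W(V2) * lam ** (n * (alpha + beta - z)) / np.sqrt(V2)

def collapse(U2, P, n, C=1.0):
    """Rescale both axes so that the tails of different shells fall on one curve."""
    V2 = C * U2 * lam ** (n * alpha)
    return V2, P * lam ** (-n * (alpha + beta - z))

U2 = np.linspace(20.0, 200.0, 50)
curves = {n: collapse(U2, pdf_tail(U2, n), n) for n in (11, 15, 18)}
for n, (V2, P_rescaled) in curves.items():
    # After rescaling, every shell traces the same master curve W(V^2)/V.
    print(n, np.max(np.abs(P_rescaled / (W(V2) / np.sqrt(V2)) - 1.0)))
```

Before rescaling, the three tails differ by large factors at the same value of the normalized amplitude; after rescaling they all sample one master curve, although over different ranges of V.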



IV. NUMERICAL STUDIES OF THE PDF: CORE
AND TAIL



It is well known that PDFs in multiscaling systems are
not scale invariant. Nevertheless we need to examine
the possibility that the cores of the PDFs can be collapsed
using a rescaling law that is characteristic of them,
while the tails may be collapsed using another rescaling
law (with different scaling exponents). This possibility
is related to the fact that the structure functions $S_p(k_n)$
have scaling exponents in the vicinity of the K41 values
[$\zeta_{\rm K41}(p) = p/3$] for $p$ small enough (say $p < 6$). For large
values of $p$ (say $p > 12$) the $p$-dependence of $\zeta_p$ has a different
slope, cf. Eq. (23). These differences result from the
core of the PDFs originating from the bulk of the fluctuations
while the tail of the PDFs results from the well-separated
high amplitude peaks. Accordingly the functional forms
of the core and the tail of the PDFs are different. This is
demonstrated in Fig. 11 (upper panel) where the PDFs
for the 11th, 15th and 18th shells are displayed. One
sees that the cores (say $U_n^2 < 20$) are practically collapsed
while the tails are widely separated. Needless to
say, the collapse is due to our choice of display as a function
of $U_n^2$: for K41 PDFs such a display would result
in a complete collapse, core as well as tail. We stress
though that if one expanded the scale one could observe
that the collapse of the core is not precise: the scaling
exponents even for $p = 2$ and $p = 4$ are not 2/3 and 4/3
respectively. The anomaly of these exponents is however
sufficiently small to allow an approximate collapse of the
cores.
FIG. 11. Upper panel: PDFs of the 11th, 15th and 18th
shells (averaged over 10^ time steps). Lower panel: Tails of 
PDFs (with the cores left out) fitted by functions of the form 
$\ln[P_n(U_n^2)] = a_n + b_n U_n$ (continuous lines).



Our aim here is to test the predictions regarding the
tails of the PDFs. We note that PDFs that originate
from data tend to have rather noisy tails. This poses
difficulties in assessing the accuracy of the collapse of
the tails. Therefore we opt to first fit the PDFs with
some appropriate functional form and then to collapse
the fit functions. As a natural fit function we choose
$\ln[P_n(U_n^2)] = a_n + b_n (U_n^2)^{c_n}$, with fit parameters $a_n$,
$b_n$ and $c_n$. The results of our fits showed that the parameters
$c_n$ are close to 1/2 for all values of $n > 11$. Therefore
we fixed the value $c_n = 1/2$ and optimized the values
of $a_n$ and $b_n$ to get the best fits in the tail regions. Now
the fit formula reads

$$\ln[P_n(U_n^2)] = a_n + b_n U_n . \tag{24}$$

The corresponding fits for the tails of the PDFs for the
11th, 15th and 18th shells are shown in Fig. 11, lower
panel. The fits are excellent for $U_n^2 > 20$ but, not surprisingly,
they fail for smaller values of $U_n^2$, especially for
larger values of $n$.

To collapse the tails together we need to choose a reference
shell $n_r$; we show the results for $n_r = 11$. Replotting
$\ln[P_n(U_n^2)] - a_n + a_{11}$ as a function of $b_n U_n/b_{11}$ one collapses
the tails of all the PDFs on the tail of the PDF for
$n_r = 11$. This is shown in Fig. 12.

The theoretical predictions [Eqs. (16) and (20)] are

$$a_n - a_{11} = (n - 11)(\alpha + \beta - z)\ln\lambda , \qquad 2\ln(b_n/b_{11}) = (n - 11)\,\alpha \ln\lambda . \tag{25}$$

According to Eq. (16) and the relation $y + z = 1$ one
computes $\alpha + \beta - z = 2 - 3z - \zeta_2 + \beta$. We plot now the measured
(by the best fits) values of $2\ln(b_n/b_{11})/\ln\lambda$ vs $(n - 11)$.
Finding the best linear fit to the resulting plot we compute
$\alpha = -0.25 \pm 0.03$. Noticing the independently measured
values of $y = 0.24 \pm 0.01$, $\zeta_2 = 0.72 \pm 0.01$ we see that
our value of $\alpha$ is in excellent agreement with Eq. (16);
the latter predicts $\alpha = 2y - \zeta_2 \approx -0.24$.
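The comparison between the predicted and the fitted exponent of the slopes can be reproduced in a few lines. The tail slopes used below are hypothetical numbers constructed to obey the predicted scaling exactly, with an arbitrary reference value; only y, the second-order exponent 0.72 and the quoted measurement of -0.25 with uncertainty 0.03 come from the text.

```python
import numpy as np

lam = 2.0
y, zeta2 = 0.24, 0.72
alpha_pred = 2 * y - zeta2                     # prediction from the peak scaling

# Hypothetical tail slopes b_n obeying 2 ln(b_n / b_11) = (n - 11) alpha ln(lam).
shells = np.array([11, 13, 15, 17, 18])
b11 = -1.0                                     # arbitrary reference slope (assumption)
b = b11 * lam ** ((shells - 11) * alpha_pred / 2)

# Slope of 2 ln(b_n / b_11) / ln(lam) against (n - 11) recovers alpha.
alpha_fit = np.polyfit(shells - 11, 2 * np.log(b / b11) / np.log(lam), 1)[0]

alpha_meas, alpha_err = -0.25, 0.03            # value quoted from the fits
print(alpha_pred, alpha_fit, abs(alpha_pred - alpha_meas) <= alpha_err)
```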
FIG. 12. Full collapse of the PDF tails of the 11th, 15th
and 18th shells. Note that in the core region the data do
not collapse.



We want next to find the value of $\beta$ from the first of
Eqs. (25). Unfortunately the values of $a_n$ are not computable
with the same accuracy as those of $b_n$. The reason
for this is that the fit formula picks up the values
of the intercepts of Eq. (24) with much worse precision
than the slopes. Accordingly the plot of $(a_n - a_{11})/\ln\lambda$ vs
$(n - 11)$ is much more scattered than the corresponding
plot for the slopes, and we can only offer a rough estimate
of the expected values of $\beta$: $0.2 < \beta < 0.6$.

This rough estimate is not satisfactory, and therefore
we attempt now to find a sharper result for $\beta$ using
Eq. (23). In Ref. [?] we measured the values of $\zeta_p$ for
$p = 1, 2, 3, \ldots, 7$. We recognize that these values of $p$ are
not large enough to determine the asymptotic slope of
$\zeta_p$. Nevertheless for a semi-quantitative analysis we can
use a reasonable fit formula for the $p$-dependence of $\zeta_p$, for
example:

$$\zeta_p = \frac{p}{3} - \frac{\delta\, p\,(p-3)}{1 + \gamma p} . \tag{26}$$

With this we find the "best" values of $\delta$ and $\gamma$ that agree
with the measured values of $\zeta_p$: $\delta \approx 0.092$, $\gamma \approx 0.725$.
With these values Eq. (26) predicts for $p \to \infty$

$$\zeta_p \approx 0.56 + 0.21\,p . \tag{27}$$

According to the prediction (23) the slope of this dependence
is $y$. The value of $y$ found above from the inter-shell
collapse of the separated peaks is $y = 0.24 \pm 0.01$,
in agreement with the value of $y$ found from the collapse
of PDF tails. The value $y = 0.24 \pm 0.01$ differs a bit from
the slope in Eq. (27). Nevertheless in light of the inaccuracy
of the measured values of $\zeta_p$ for large $p$ (originating
mostly from the finite extent of the inertial interval), one
cannot trust the last digits in the numbers of Eq. (27).
We thus consider the agreement between the estimated
values of $y$ more than acceptable.
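The asymptotic line quoted for large p follows from the fit formula by elementary algebra, which the sketch below checks numerically. It assumes that the fit formula reads $\zeta_p = p/3 - \delta p(p-3)/(1+\gamma p)$; this reading reproduces the quoted slope and intercept and the exact value 1 at p = 3. The names `zeta_fit`, `slope` and `intercept` are introduced here for illustration.

```python
delta, gamma = 0.092, 0.725           # best-fit values quoted in the text

def zeta_fit(p):
    """Fit formula for zeta_p (assumed form, see lead-in)."""
    return p / 3.0 - delta * p * (p - 3.0) / (1.0 + gamma * p)

# Large-p expansion: p(p-3)/(1+gamma p) = (1/gamma) [p - (3 + 1/gamma) + O(1/p)].
slope = 1.0 / 3.0 - delta / gamma
intercept = (delta / gamma) * (3.0 + 1.0 / gamma)

print(round(slope, 2), round(intercept, 2))            # compare with the quoted 0.21 and 0.56
print(zeta_fit(1e6) - (slope * 1e6 + intercept))       # remainder vanishes as p grows
```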

Thus we will use the intercept in Eq. (27) to estimate
$\beta$. Considering Eq. (23) the free term in (27) has to
be $z - \beta$. With $z \approx 0.74$ we compute $\beta \approx 0.18$, which
is at the borderline of the expected region [0.2, 0.6] found
above from collapsing the PDF tails. Taking then a value
of $\beta \approx 0.2$ allows us to evaluate the number of peaks $N_n$
in the $n$th shell when there were $N_{n-1}$ peaks in the previous
one:

$$N_n/N_{n-1} = \lambda^{\beta} \approx 1.15 , \quad \text{for } \lambda = 2,\ \beta = 0.2 . \tag{28}$$

The conclusion is that peak splitting leads (for $\lambda = 2$ and
the chosen values of $a$, $b$, $c$) to a 15% increase of $N_n$ from
shell to shell.

A cursory look at Fig. 6 may leave the impression that
this is an underestimate. After all, from one peak in shell
15 the cascade forms four or five peaks in shell 20. A rate
of increase of 15% per shell would result in a factor of 2, not 5.
But we must remember that we talk about peaks of a
given amplitude, and the peak splitting results in peaks
of varying amplitudes. The counting of peaks of comparable
amplitudes is more subtle, and the predicted rate
of 15% increase should be interpreted in the statistical
sense, taking many realizations into account.
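The arithmetic behind the estimate of the splitting exponent and the resulting growth in the number of peaks is short enough to check directly. The variable names are illustrative; the numbers are those quoted in the text.

```python
lam = 2.0
z, intercept = 0.74, 0.56            # z and the free term of the asymptotic fit
beta_est = z - intercept             # free term equals z - beta
beta = 0.2                           # rounded value adopted in the text

ratio = lam ** beta                  # peaks in shell n per peak in shell n - 1
growth = ratio ** 5                  # accumulated over the five steps from shell 15 to 20

print(round(beta_est, 2), round(ratio, 2), round(growth, 2))
```

The accumulated factor over five shells is 2 raised to the power 5 times beta, which for beta = 0.2 equals 2, as stated.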



V. SELF-SIMILAR SOLUTIONS OF THE SABRA 
SHELL MODEL 

In this section we rationalize the scale-invariant
form (11) on the basis of the equations of motion of the
Sabra model (2). The exponent $y$ and the times $t_n$ which
appear in Eq. (11) are chosen according to

$$y = 1 - z , \qquad t_n - t_{n-1} = \Delta\, \lambda^{-nz} , \tag{29}$$

with an arbitrary positive parameter $\Delta$ (note that in Ref. [?]
there was a salient choice of $\Delta = 0$). These choices are
not specific to the Sabra model; in Refs. [?] identical
choices were made for the Obukhov-Novikov (ON) and
the Gledzer-Ohkitani-Yamada (GOY) models. The
first relation follows from simple power counting, since the
RHS of the equation of motion for the $n$th shell is proportional
to $\lambda^n$. Indeed, we saw that this scaling relation
is in good agreement with our numerical observations.
The second choice of (29) reflects the fact that the time delay
between the appearance of the peaks in consecutive
shells falls off geometrically with $n$; see Fig. 6 for
an example. Nevertheless we want to show directly that
these choices are supported by the equations of motion.
In doing so we follow Ref. [g[. Substituting ( |ll| ) 
and (^9|) in (||) we find the equation of motion of the 
scaling function /(t) which is valid in the inertial inter- 
val: 



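$$ \frac{d f(\tau)}{d\tau} = \mathcal{S}(\tau), \tag{30} $$

$$
\begin{aligned}
\mathcal{S}(\tau) ={}& a\,\lambda^{3z-2}\, f^*[\lambda^{z}(\tau-\tau_0)+\tau_0]\, f[\lambda^{2z}(\tau-\tau_0)+\tau_0] \\
&+ c\,\lambda^{2-3z}\, f[\lambda^{-z}(\tau-\tau_0)+\tau_0]\, f[\lambda^{-2z}(\tau-\tau_0)+\tau_0] \\
&- (a+c)\, f^*[\lambda^{-z}(\tau-\tau_0)+\tau_0]\, f[\lambda^{z}(\tau-\tau_0)+\tau_0].
\end{aligned}
\tag{31}
$$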
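
The powers of $\lambda$ in Eq. (31) can be traced by power counting. The following is only a sketch, under the assumption (ours, inferred from the rescaled time $\tau_n = \lambda^{zn}(t-t_n)$) that the form (11) reads $u_n(t) \propto \lambda^{-yn} f[\lambda^{zn}(t-t_n)]$, and using the standard structure of the three nonlinear terms of the Sabra model with $k_n \propto \lambda^n$:

$$
\begin{aligned}
\frac{du_n}{dt} &\propto \lambda^{(z-y)n}\, f'(\tau_n), \\
k_{n+1}\,u_{n+2}\,u^*_{n+1} &\propto \lambda^{n+1}\,\lambda^{-y(2n+3)} = \lambda^{(z-y)n}\,\lambda^{(1-y-z)n}\,\lambda^{1-3y}, \\
k_{n}\,u_{n+1}\,u^*_{n-1} &\propto \lambda^{n}\,\lambda^{-2yn} = \lambda^{(z-y)n}\,\lambda^{(1-y-z)n}, \\
k_{n-1}\,u_{n-1}\,u_{n-2} &\propto \lambda^{n-1}\,\lambda^{-y(2n-3)} = \lambda^{(z-y)n}\,\lambda^{(1-y-z)n}\,\lambda^{3y-1}.
\end{aligned}
$$

The dependence on $n$ drops out of the equation only if $y + z = 1$. In that case $\lambda^{1-3y} = \lambda^{3z-2}$ and $\lambda^{3y-1} = \lambda^{2-3z}$, which are the prefactors of the first two terms of Eq. (31), while the remaining term carries no power of $\lambda$.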

To get this equation we changed the time variable from $t$
to $\tau_n = \lambda^{zn}(t - t_n)$, used the same $\tau_n$ in all the shells
involved, and finally denoted $\tau_n = \tau$. The characteristic
time $\tau_0$ is obtained from computing the sum of
all time increments $\sum_{m=n}^{\infty}(t_{m+1} - t_m)$, and noting that
it converges to $\tilde t_0$, where

$$ \tilde t_0 = \lambda^{-zn}\,\tau_0, \qquad \tau_0 = \Delta/(\lambda^{z} - 1). \tag{32} $$

The meaning of $\tilde t_0$ is the time needed for a pulse to
propagate from the nth shell all the way to infinitely high
shells. The characteristic time $\tau_0$ allows one to convert
all the arguments of the functions $f$ involved in (31) to
a universal form $[\lambda^{mz}(\tau - \tau_0) + \tau_0]$, with $m = \pm 1, \pm 2$.
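
Both statements of the preceding paragraph can be verified numerically. The sketch below assumes that the time increments have the geometric form $t_{m+1} - t_m = \Delta\,\lambda^{-z(m+1)}$, which is what Eq. (32) implies. The values $\lambda = 2$ and $\Delta = 1$ are illustrative assumptions (this section does not restate the values used in the paper), $z = 0.75$ is the numerical finding quoted in the text, and all function names are ours.

```python
import math

# Assumed illustrative values: LAM = 2 is a common shell spacing, DELTA is
# arbitrary and positive, Z = 0.75 is the numerical value quoted in the text.
LAM, Z, DELTA = 2.0, 0.75, 1.0

def tau0(delta=DELTA, lam=LAM, z=Z):
    """Characteristic time of Eq. (32): tau_0 = Delta / (lam**z - 1)."""
    return delta / (lam ** z - 1.0)

def time_to_infinite_shells(n, terms=200, delta=DELTA, lam=LAM, z=Z):
    """Truncated sum of the increments Delta * lam**(-z*(m+1)) for m >= n."""
    return sum(delta * lam ** (-z * (m + 1)) for m in range(n, n + terms))

def rescaled_time(t, n, t_n, lam=LAM, z=Z):
    """tau_n = lam**(z*n) * (t - t_n)."""
    return lam ** (z * n) * (t - t_n)

def universal_argument(tau, m, delta=DELTA, lam=LAM, z=Z):
    """lam**(m*z) * (tau - tau_0) + tau_0: the rescaled time of shell n + m."""
    t0 = tau0(delta, lam, z)
    return lam ** (m * z) * (tau - t0) + t0

n, t_n, t = 11, 0.0, 1.0e-3
t_next = t_n + DELTA * LAM ** (-Z * (n + 1))   # peak time of shell n + 1

# (i) the sum of all increments converges to lam**(-z*n) * tau_0
partial_sum = time_to_infinite_shells(n)
closed_form = LAM ** (-Z * n) * tau0()

# (ii) the rescaled time of shell n + 1 takes the universal form with m = 1
direct = rescaled_time(t, n + 1, t_next)
via_tau0 = universal_argument(rescaled_time(t, n, t_n), 1)

print(partial_sum, closed_form)
print(direct, via_tau0)
```

The two numbers on each printed line should agree to rounding accuracy; the same check with `m = 2`, `-1` or `-2` covers the other arguments of Eq. (31).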

It was shown in Ref. B] that Eqs. (30, 31) can be
considered as a nonlinear eigenvalue problem. They have
the trivial solution $f(\tau) = 0$, but they may have nonzero
solutions for particular values of $z$ and $\Delta$. For example,
the nonzero solution $f(\tau) = \mathrm{const}$ requires $z = 2/3$.
Nevertheless the constant solution fails to fulfill the
requirement that $\lim_{\tau\to\pm\infty} f(\tau) = 0$. We expect that a
nontrivial solution that satisfies the boundary conditions
will force $z$ into the observed value, which lies between 2/3
and 1. The actual calculations that demonstrate this are
outside the scope of this paper. We just reiterate our
numerical finding that $z \approx 0.75$ for the particular set of
parameters $a$, $b$, $c$ and $\lambda$ that were employed in this study.
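
The statement about the constant solution can be checked directly. For a real constant $f$, Eqs. (30, 31) reduce to $a\lambda^{3z-2} + c\lambda^{2-3z} - (a+c) = 0$. The sketch below solves this condition; the parameter values $a = 1$, $c = -0.5$, $\lambda = 2$ are an assumption (a common Sabra-model choice, not restated in this section), and the function names are ours.

```python
import math

# Assumed parameter values (a common Sabra-model choice; the values used in
# the paper are not restated in this section).
A_COEF, C_COEF, LAM = 1.0, -0.5, 2.0

def constant_rhs(z, a=A_COEF, c=C_COEF, lam=LAM, f0=1.0):
    """Right-hand side of Eq. (31) evaluated on the real constant f(tau) = f0."""
    return f0 ** 2 * (a * lam ** (3 * z - 2) + c * lam ** (2 - 3 * z) - (a + c))

def constant_solution_exponents(a=A_COEF, c=C_COEF, lam=LAM):
    """All real z for which a nonzero constant solves Eqs. (30, 31).

    With x = lam**(3z - 2) the condition a*x + c/x - (a + c) = 0 factorizes
    as (x - 1) * (a*x - c) = 0; only positive roots x give a real z.
    """
    roots = [1.0, c / a]
    return [(2.0 + math.log(x, lam)) / 3.0 for x in roots if x > 0]

print(constant_solution_exponents())   # [0.6666666666666666]
print(constant_rhs(2.0 / 3.0))         # vanishes up to rounding
print(constant_rhs(0.75))              # nonzero: no constant solution at z = 0.75
```

With these assumed signs ($c/a < 0$) the root $x = 1$, i.e. $z = 2/3$, is the only admissible one. For $c/a > 0$ the factorization would allow a second constant solution, so the uniqueness of $z = 2/3$ depends on the parameter choice.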

FIG. 13. Panel a: data and analytic fit for the PDF of the
11th shell in a short time horizon of 10^ time steps. Note
that here we presented all the events, including four isolated
events that give rise to the upswinging strings of data points
with amplitudes larger than 7. Panel b: same as in panel a
together with the tail (dashed line) predicted from the tail of
the 18th shell in the same short time horizon.

FIG. 14. Test of the predicted PDF for the 11th shell using
data from a hundredfold longer time horizon of 10^ time
steps.



VI. PREDICTING TAILS OF PDF'S FROM DATA 
MEASURED IN SHORT TIME HORIZONS 

In this section we demonstrate that the analysis
presented above can be used to predict the tails of the PDF's
of large scale phenomena (relatively low values of n) using
only data measured in the short time horizon. We
focus on the example shown in Fig. g, i.e. n = 11 with
10^ time steps.

We first fit the PDF shown in Fig. g, using a fit formula
which is inspired by Eq. (24):



\n[Pn{i 



(33) 



and found $a_n \approx 1.34$, $b_n \approx -4.64$, $c_n \approx 0.28$. The
data and the best fit are shown in Fig. 13 panel a.

Next we want to continue the PDF of n = 11 into event
values that are too rare in the short time horizon. To this
aim we measured, in the same time window of 10^ time
steps, the tail of the PDF of the 18th shell. In doing so
we use the fact that the small scale events have a much
shorter turnover time, so that the "short" time horizon is
sufficiently long to provide a good estimate of the tail.
We fitted the tail with Eq. (24) and found $a_{18} \approx -5.3$,
$b_{18} \approx -0.94$. From this value and Eq. (25) we can
predict $b_{11}$. We employ the value $\alpha \approx 0.24$, which is taken
from Eq. (16) with the known value of $y$ (from the
intershell collapse) and of $\zeta_2$. The resulting prediction is
$b_{11} \approx -1.72$.

Rather than attempting to also predict $a_n$ in Eq. (24)
(knowing the inaccuracies of intercepts) we glued the tail
with the predicted value of $b_{11}$ to the core PDF function
(33) by finding the unique point of continuity with the same
first derivative. The way that the predicted tail hangs
onto the PDF is shown in Fig. 13 panel b.
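
The gluing step amounts to a tangency condition: find the point where the core and the tail have the same slope, then fix the tail intercept by continuity. The sketch below illustrates this with hypothetical functional forms, since the fit formulas of Eqs. (24) and (33) are not reproduced in this section. The tail is assumed to be $A + B\,u^{\alpha}$ with known $B$ and $\alpha$, the core is an arbitrary differentiable log-PDF, and the numbers in the example are made up for illustration; they are not the fitted values quoted above.

```python
def glue_tail(core, dcore, B, alpha, lo, hi, tol=1e-12):
    """Glue a tail A + B*u**alpha onto core(u) with a continuous first derivative.

    Returns (u_star, A), where u_star in [lo, hi] is the point at which the
    slopes of core and tail coincide (found by bisection) and A is the
    intercept that makes the glued function continuous at u_star.
    """
    def slope_gap(u):
        return dcore(u) - B * alpha * u ** (alpha - 1.0)

    g_lo, g_hi = slope_gap(lo), slope_gap(hi)
    if g_lo * g_hi > 0:
        raise ValueError("slope mismatch does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = slope_gap(mid)
        if g_lo * g_mid <= 0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    u_star = 0.5 * (lo + hi)
    A = core(u_star) - B * u_star ** alpha
    return u_star, A

# Hypothetical example: Gaussian core ln P = -u**2 / 2 and a tail with
# B = -4, alpha = 0.5 (illustrative numbers only).
def core(u):
    return -0.5 * u * u

def dcore(u):
    return -u

B, alpha = -4.0, 0.5
u_star, A = glue_tail(core, dcore, B, alpha, lo=0.1, hi=10.0)
print(u_star, A)   # for this example the tangency point is 2**(2/3)
```

Treating the intercept as an output rather than an input mirrors the choice described above: only the tail coefficient is predicted, and the intercept follows from continuity. The bracketing interval `[lo, hi]` must contain exactly one sign change of the slope mismatch for the glue point to be unique.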

To test the quality of the prediction we then ran the
simulation for a time horizon that is a hundred times
longer (i.e. 10® time steps). Such a run can resolve the
events that belong to the tail, and indeed the comparison
is surprisingly good, as seen in Fig. 14.



VII. SUMMARY 

The main aims of this paper are twofold: on the one 
hand we aimed at understanding the detailed dynamical 
scaling properties of the largest events in our system. On 
the other hand we wanted to employ these properties to 
predict the probability of these events even in situations 
in which they are very rare. 

The first aim was achieved by focusing on the largest
events, following their cascade down the scales (or up
the shells), and learning how to collapse them on each
other by rescaling their amplitudes and their time
arguments. This exercise culminated in Eq.(0) which
represents the largest events $u_n(t)$ in terms of a "universal"
function $f(\tau)$, where $\tau$ is a properly rescaled time
difference from the peak time of the event. This dynamical
scaling form is characterized by two exponents, a "static"
one denoted $y$ and a "dynamic" one denoted $z$. We
argued theoretically for a scaling relation $z + y = 1$, and
determined the values of these exponents on the basis
of the analysis of isolated events in short time horizons.

The second aim was accomplished by developing a
scaling theory for the tails of the PDF's in different shells.
We have learned how to translate information from the
tail of a PDF in a high shell to the tail of a PDF of
a low shell. In doing so we made use of the fact that the
high shells (small length scales) have much shorter
characteristic time scales. Thus even short time horizons
are sufficient to accumulate reliable statistics on the tails
of the PDF's of high shells. Having a theory to
translate the information to low shells in which the tails are
extremely sparse (or even totally absent), we could
overcome the meager statistics. We could present predicted
tails that were populated only in time horizons that were
a hundredfold longer than those in which the analysis
was performed.

We demonstrated the existence of scaling properties of
the extreme events that are distinct from those of the bulk
of the fluctuations that make up the core of the PDF. In
this sense the extreme events are outliers. We cannot, on
the basis of the present work, claim that this approach
has a general applicability to a large class of physical
systems in which extreme events are important. We
certainly made a crucial and explicit use of the scale
invariance of the underlying equation of motion. This scale
invariance translates here to an intimate connection
between extreme events appearing on one length scale at
one time and extreme events appearing on smaller length
scales at later (and predictable) times (cf. Fig. 6). We
are pretty confident that similar ideas can (and should)
be applied to fluid turbulence; whether or not such
techniques will be applicable to broader issues like
geophysical phenomena or financial markets is a question
that we pose to the community at large.